Skip to content

MerkleTransport trait for client <> server#514

Open
malcolmgreaves wants to merge 1 commit intomg/merkle_dyn_use_file_backendfrom
mg/merkle_dyn_pack_interfaces
Open

MerkleTransport trait for client <> server#514
malcolmgreaves wants to merge 1 commit intomg/merkle_dyn_use_file_backendfrom
mg/merkle_dyn_pack_interfaces

Conversation

@malcolmgreaves
Copy link
Copy Markdown
Collaborator

Adds new traits for defining how MerkleStores can encode Merkle tree
nodes to send and receive between the oxen-cli client and oxen-server.
The MerklePacker trait collects a subset of nodes and allows a writer to
encode them.

The MerkleUnpacker trait uses a provided reader to decode Merkle tree
nodes and update the underlying physical store. Provided that they have the
same Error type, combining these two traits is a MerkleTransport. A
blanket impl for any such unified traits is included here too.

Also defines a TransportableMerkleStore, which is a MerkleStore + a
MerkleTransport. A blanket impl. is provided to unify, so long as the
type's implementation's Errors are all the same.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a merkle-tree wire-transport abstraction: new merkle_transport module with MerklePacker, MerkleUnpacker, MerkleTransport, PackOptions, UnpackOptions; re-exported from merkle_tree.rs. Introduces TransportableMerkleStore marker trait with a blanket ?Sized impl.

Changes

Merkle Tree Transport Abstraction

Layer / File(s) Summary
Configuration & Options
crates/lib/src/model/merkle_tree/merkle_transport.rs
Adds PackOptions (ServerCanonical, LegacyClientPush) and UnpackOptions (Overwrite, SkipExisting).
Core Transport Traits
crates/lib/src/model/merkle_tree/merkle_transport.rs
Adds MerklePacker (pack_nodes, pack_all) and MerkleUnpacker (unpack) traits operating on tar-gz streams (&mut dyn Write / &mut dyn Read).
Marker Trait & Blanket Impl
crates/lib/src/model/merkle_tree/merkle_transport.rs
Adds MerkleTransport super-trait and blanket impl impl<T: MerklePacker + MerkleUnpacker + ?Sized> MerkleTransport for T {}.
Module Integration & Re-exports
crates/lib/src/model/merkle_tree.rs
Declares pub mod merkle_transport; and re-exports MerklePacker, MerkleTransport, MerkleUnpacker, PackOptions, UnpackOptions.
Store Integration
crates/lib/src/model/merkle_tree.rs
Adds TransportableMerkleStore: MerkleStore + MerkleTransport and blanket impl impl<T: MerkleStore + MerkleTransport + ?Sized> TransportableMerkleStore for T {}.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Packer as MerklePacker
    participant Transport as Stream
    participant Unpacker as MerkleUnpacker
    participant Store as MerkleStore

    Client->>Packer: request pack_nodes(hashes, PackOptions)
    Packer->>Transport: write tar-gz stream (nodes)
    Transport->>Unpacker: provide stream
    Unpacker->>Store: install nodes (UnpackOptions)
    Unpacker-->>Client: return HashSet<MerkleHash>
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰
I bundled hashes, snug and tight,
in tar-gz dreams for day or night.
Pack, send, unpack — a tidy art,
now merkle hops from part to part.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main change—introducing MerkleTransport trait for client-server communication.
Description check ✅ Passed The description thoroughly explains the traits added (MerklePacker, MerkleUnpacker, MerkleTransport, TransportableMerkleStore) and their purpose for encoding/decoding Merkle nodes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mg/merkle_dyn_pack_interfaces

Comment @coderabbitai help to get the list of available commands and usage tips.

@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from 291ef08 to 844e985 Compare May 4, 2026 17:47
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from b38fcd2 to c5c8270 Compare May 4, 2026 17:47
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from 844e985 to 460483e Compare May 4, 2026 17:58
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from c5c8270 to f12af68 Compare May 4, 2026 17:58
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch 10 times, most recently from f9502a5 to 7ecfb7f Compare May 5, 2026 00:40
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch 2 times, most recently from c65ffd7 to 5813fbf Compare May 5, 2026 00:42
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/lib/src/model/merkle_tree/merkle_transport.rs`:
- Around line 60-71: The trait doc for pack_nodes must require a deterministic,
stable ordering when emitting entries because hashes is a HashSet; update the
doc for fn pack_nodes (and mention PackOptions) to state implementations must
sort the provided HashSet<MerkleHash> into a defined order (e.g., lexicographic
on the MerkleHash bytes or hex) before writing to out: &mut dyn Write so the
tar-gz wire output is stable across runs; also update any existing
implementations to perform that sort when iterating and packing, and mention
OxenError remains the error type for failures.
- Around line 91-101: The unpack contract (fn unpack(&self, reader: &mut dyn
Read, opts: UnpackOptions) -> Result<HashSet<MerkleHash>, OxenError>) must
explicitly require archive-safety: update the doc comment to state that
implementations MUST reject or fail on unsafe tar entries (absolute paths, any
path containing .. that would escape the intended extraction root, and symlinks
that point outside the target store) and MUST sanitize relative paths and
enforce extraction into the store root; then update all implementations to
validate each tar header/path, canonicalize and check it remains within the
target store root, and return an Err(OxenError) when encountering an unsafe
entry rather than writing outside the store (reference UnpackOptions, MerkleHash
and OxenError in the doc so callers and implementers know the expected
behavior).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 668c28ee-ced8-4a14-9554-c98f5d0fa598

📥 Commits

Reviewing files that changed from the base of the PR and between c65ffd7 and 5813fbf.

📒 Files selected for processing (2)
  • crates/lib/src/model/merkle_tree.rs
  • crates/lib/src/model/merkle_tree/merkle_transport.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/lib/src/model/merkle_tree.rs

Comment on lines +60 to +71
/// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout
/// selected by `opts`.
///
/// Hashes not present in the store are silently skipped, and an empty `hashes`
/// produces a valid but empty tarball. See [`PackOptions`] for the per-variant
/// wire-format details.
fn pack_nodes(
&self,
hashes: &HashSet<MerkleHash>,
opts: PackOptions,
out: &mut dyn Write,
) -> Result<(), OxenError>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Specify deterministic packing order for stable wire output.

Because input is a HashSet, iteration order is non-deterministic unless implementations sort before emitting entries. Please codify deterministic ordering in the trait docs to avoid unstable tar-gz bytes across runs/processes.

Suggested doc-level clarification
     /// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout
     /// selected by `opts`.
+    ///
+    /// Ordering contract: output must be deterministic for the same logical input
+    /// set (e.g., sort hashes before emitting) rather than relying on `HashSet`
+    /// iteration order.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout
/// selected by `opts`.
///
/// Hashes not present in the store are silently skipped, and an empty `hashes`
/// produces a valid but empty tarball. See [`PackOptions`] for the per-variant
/// wire-format details.
fn pack_nodes(
&self,
hashes: &HashSet<MerkleHash>,
opts: PackOptions,
out: &mut dyn Write,
) -> Result<(), OxenError>;
/// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout
/// selected by `opts`.
///
/// Ordering contract: output must be deterministic for the same logical input
/// set (e.g., sort hashes before emitting) rather than relying on `HashSet`
/// iteration order.
///
/// Hashes not present in the store are silently skipped, and an empty `hashes`
/// produces a valid but empty tarball. See [`PackOptions`] for the per-variant
/// wire-format details.
fn pack_nodes(
&self,
hashes: &HashSet<MerkleHash>,
opts: PackOptions,
out: &mut dyn Write,
) -> Result<(), OxenError>;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/lib/src/model/merkle_tree/merkle_transport.rs` around lines 60 - 71,
The trait doc for pack_nodes must require a deterministic, stable ordering when
emitting entries because hashes is a HashSet; update the doc for fn pack_nodes
(and mention PackOptions) to state implementations must sort the provided
HashSet<MerkleHash> into a defined order (e.g., lexicographic on the MerkleHash
bytes or hex) before writing to out: &mut dyn Write so the tar-gz wire output is
stable across runs; also update any existing implementations to perform that
sort when iterating and packing, and mention OxenError remains the error type
for failures.

Comment on lines +91 to +101
/// Unpack the tar-gz stream from `reader` into the store, applying the existing-file
/// policy in `opts`.
///
/// Returns the set of hashes parsed from the tarball (not necessarily only those
/// newly installed — entries skipped per [`UnpackOptions::SkipExisting`] still appear
/// in the result, matching `main`'s `repositories::tree::unpack_nodes` behaviour).
fn unpack(
&self,
reader: &mut dyn Read,
opts: UnpackOptions,
) -> Result<HashSet<MerkleHash>, OxenError>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Define archive-safety requirements in the unpack contract.

unpack installs filesystem-backed content but the trait contract doesn’t require rejecting unsafe tar entries (absolute paths, .. traversal, symlink escapes). Please make this explicit so all implementations enforce the same security baseline.

Suggested doc-level contract addition
 pub trait MerkleUnpacker: Send + Sync {
     /// Unpack the tar-gz stream from `reader` into the store, applying the existing-file
     /// policy in `opts`.
+    ///
+    /// Security contract:
+    /// - Reject entries that resolve outside the intended node root (absolute paths,
+    ///   `..` traversal, or symlink-based escapes).
+    /// - Reject malformed archive entries whose on-wire path cannot be mapped to a
+    ///   valid `MerkleHash`.
     ///
     /// Returns the set of hashes parsed from the tarball (not necessarily only those
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/lib/src/model/merkle_tree/merkle_transport.rs` around lines 91 - 101,
The unpack contract (fn unpack(&self, reader: &mut dyn Read, opts:
UnpackOptions) -> Result<HashSet<MerkleHash>, OxenError>) must explicitly
require archive-safety: update the doc comment to state that implementations
MUST reject or fail on unsafe tar entries (absolute paths, any path containing
.. that would escape the intended extraction root, and symlinks that point
outside the target store) and MUST sanitize relative paths and enforce
extraction into the store root; then update all implementations to validate each
tar header/path, canonicalize and check it remains within the target store root,
and return an Err(OxenError) when encountering an unsafe entry rather than
writing outside the store (reference UnpackOptions, MerkleHash and OxenError in
the doc so callers and implementers know the expected behavior).

@malcolmgreaves
Copy link
Copy Markdown
Collaborator Author

NOTE: Stacked PR! Must merge #513 before merging.

@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch 2 times, most recently from b36f043 to 30c2b85 Compare May 5, 2026 01:18
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from 5813fbf to 11eacec Compare May 5, 2026 01:18
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from 30c2b85 to 4f65285 Compare May 5, 2026 01:27
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from 11eacec to d4d0762 Compare May 5, 2026 01:27
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from 4f65285 to 3239a5d Compare May 5, 2026 17:09
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from d4d0762 to 234e31d Compare May 5, 2026 17:09
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from 3239a5d to e74a7e6 Compare May 7, 2026 19:24
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from 234e31d to 029e820 Compare May 7, 2026 19:24
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from e74a7e6 to 5070203 Compare May 7, 2026 20:05
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from 5070203 to ba031b3 Compare May 7, 2026 20:06
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch 2 times, most recently from 2bd348f to cbf9b0a Compare May 7, 2026 20:12
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from ba031b3 to ff359e0 Compare May 7, 2026 21:40
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from cbf9b0a to df8d6fb Compare May 7, 2026 21:40
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_use_file_backend branch from ff359e0 to d152216 Compare May 7, 2026 23:32
Adds new traits for defining how `MerkleStore`s can encode Merkle tree
nodes to send and receive between the `oxen-cli` client and `oxen-server`.
The `MerklePacker` trait collects a subset of nodes and allows a writer to
encode them.

The `MerkleUnpacker` trait uses a provided reader to decode Merkle tree
nodes and update the underlying physical store. Provided that they have the
same `Error` type, combining these two traits is a `MerkleTransport`. A
blanket `impl` for any such unified traits is included here too.

Also defines a `TransportableMerkleStore`, which is a `MerkleStore` + a
`MerkleTransport`. A blanket impl. is provided to unify, so long as the
type's implementation's `Error`s are all the same.
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_dyn_pack_interfaces branch from df8d6fb to f7950a6 Compare May 7, 2026 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant